NSF PAR Search | NSF Public Access Repository

Investigating the sources of variable impact of pathogenic variants in monogenic metabolic conditions

https://doi.org/10.1038/s41467-025-60339-7

Wei, Angela; Border, Richard; Fu, Boyang; Cullina, Sinéad; Brandes, Nadav; Jang, Seon-Kyeong; Sankararaman, Sriram; Kenny, Eimear E; Udler, Miriam S; Ntranos, Vasilis; et al (December 2025, Nature Communications)

Abstract Over three percent of people carry a dominant pathogenic variant, yet only a fraction of carriers develop disease. Disease phenotypes from carriers of variants in the same gene range from mild to severe. Here, we investigate underlying mechanisms for this heterogeneity: variable variant effect sizes, carrier polygenic backgrounds, and modulation of carrier effect by genetic background (marginal epistasis). We leveraged exomes and clinical phenotypes from the UK Biobank and the Mt. Sinai BioMe Biobank to identify carriers of pathogenic variants affecting cardiometabolic traits. We employed recently developed methods to study these cohorts, observing strong statistical support and clinical translational potential for all three mechanisms of variable carrier penetrance and disease severity. For example, scores from our recent model of variant pathogenicity were tightly correlated with phenotype amongst clinical variant carriers, they predicted effects of variants of unknown significance, and they distinguished gain- from loss-of-function variants. We also found that polygenic scores modify phenotypes amongst pathogenic carriers and that genetic background additionally alters the effects of pathogenic variants through interactions.
Free, publicly-accessible full text available December 1, 2026
A scalable and robust variance components method reveals insights into the architecture of gene-environment interactions underlying complex traits

https://doi.org/10.1016/j.ajhg.2024.05.015

Pazokitoroudi, Ali; Liu, Zhengtong; Dahl, Andrew; Zaitlen, Noah; Rosset, Saharon; Sankararaman, Sriram (July 2024, The American Journal of Human Genetics)

Full Text Available
Characterizing the genetic architecture of drug response using gene-context interaction methods

https://doi.org/10.1016/j.xgen.2024.100722

Sadowski, Michal; Thompson, Mike; Mefford, Joel; Haldar, Tanushree; Oni-Orisan, Akinyemi; Border, Richard; Pazokitoroudi, Ali; Cai, Na; Ayroles, Julien F; Sankararaman, Sriram; et al (December 2024, Cell Genomics)

Full Text Available
Leveraging pleiotropy for joint analysis of genome-wide association studies with per trait interpretations

https://doi.org/10.1371/journal.pgen.1010447

Taraszka, Kodi; Zaitlen, Noah; Eskin, Eleazar (November 2022, PLOS Genetics)
Epstein, Michael P. (Ed.)
We introduce pleiotropic association test (PAT) for joint analysis of multiple traits using genome-wide association study (GWAS) summary statistics. The method utilizes the decomposition of phenotypic covariation into genetic and environmental components to create a likelihood ratio test statistic for each genetic variant. Though PAT does not directly interpret which trait(s) drive the association, a per trait interpretation of the omnibus p-value is provided through an extension to the meta-analysis framework, m-values. In simulations, we show PAT controls the false positive rate, increases statistical power, and is robust to model misspecifications of genetic effect. Additionally, simulations comparing PAT to three multi-trait methods, HIPO, MTAG, and ASSET, show PAT identified 15.3% more omnibus associations over the next best method. When these associations were interpreted on a per trait level using m-values, PAT had 37.5% more true per trait interpretations with a 0.92% false positive assignment rate. When analyzing four traits from the UK Biobank, PAT discovered 22,095 novel variants. Through the m-values interpretation framework, the number of per trait associations for two traits were almost tripled and were nearly doubled for another trait relative to the original single trait GWAS.
Full Text Available
Deep learning-based phenotype imputation on population-scale biobank data increases genetic discoveries

https://doi.org/10.1038/s41588-023-01558-w

An, Ulzee; Pazokitoroudi, Ali; Alvarez, Marcus; Huang, Lianyun; Bacanu, Silviu; Schork, Andrew J.; Kendler, Kenneth; Pajukanta, Päivi; Flint, Jonathan; Zaitlen, Noah; et al (November 2023, Nature Genetics)

Abstract Biobanks that collect deep phenotypic and genomic data across many individuals have emerged as a key resource in human genetics. However, phenotypes in biobanks are often missing across many individuals, limiting their utility. We propose AutoComplete, a deep learning-based imputation method to impute or ‘fill-in’ missing phenotypes in population-scale biobank datasets. When applied to collections of phenotypes measured across ~300,000 individuals from the UK Biobank, AutoComplete substantially improved imputation accuracy over existing methods. On three traits with notable amounts of missingness, we show that AutoComplete yields imputed phenotypes that are genetically similar to the originally observed phenotypes while increasing the effective sample size by about twofold on average. Further, genome-wide association analyses on the resulting imputed phenotypes led to a substantial increase in the number of associated loci. Our results demonstrate the utility of deep learning-based phenotype imputation to increase power for genetic discoveries in existing biobank datasets.
Methylation risk scores are associated with a collection of phenotypes within electronic health record systems

https://doi.org/10.1038/s41525-022-00320-1

Thompson, Mike; Hill, Brian L.; Rakocz, Nadav; Chiang, Jeffrey N.; Geschwind, Daniel; Sankararaman, Sriram; Hofer, Ira; Cannesson, Maxime; Zaitlen, Noah; Halperin, Eran (December 2022, npj Genomic Medicine)

Abstract Inference of clinical phenotypes is a fundamental task in precision medicine, and has therefore been heavily investigated in recent years in the context of electronic health records (EHR) using a large arsenal of machine learning techniques, as well as in the context of genetics using polygenic risk scores (PRS). In this work, we considered the epigenetic analog of PRS, methylation risk scores (MRS), a linear combination of methylation states. We measured methylation across a large cohort (n = 831) of diverse samples in the UCLA Health biobank, for which both genetic and complete EHR data are available. We constructed MRS for 607 phenotypes spanning diagnoses, clinical lab tests, and medication prescriptions. When added to a baseline set of predictive features, MRS significantly improved the imputation of 139 outcomes, whereas the PRS improved only 22 (median improvement for methylation 10.74%, 141.52%, and 15.46% in medications, labs, and diagnosis codes, respectively, whereas genotypes only improved the labs at a median increase of 18.42%). We added significant MRS to state-of-the-art EHR imputation methods that leverage the entire set of medical records, and found that including MRS as a medical feature in the algorithm significantly improves EHR imputation in 37% of lab tests examined (median R² increase 47.6%). Finally, we replicated several MRS in multiple external studies of methylation (minimum p-value of 2.72 × 10⁻⁷) and replicated 22 of 30 tested MRS internally in two separate cohorts of different ethnicity. Our publicly available results and weights show promise for methylation risk scores as clinical and scientific tools.
Full Text Available
Cross-trait assortative mating is widespread and inflates genetic correlation estimates

https://doi.org/10.1126/science.abo2059

Border, Richard; Athanasiadis, Georgios; Buil, Alfonso; Schork, Andrew J.; Cai, Na; Young, Alexander I.; Werge, Thomas; Flint, Jonathan; Kendler, Kenneth S.; Sankararaman, Sriram; et al (November 2022, Science)

Some correlations between human traits can be explained by cross-trait assortative mating and not purely genetics.
Full Text Available
Leveraging genomic diversity for discovery in an electronic health record linked biobank: the UCLA ATLAS Community Health Initiative

https://doi.org/10.1186/s13073-022-01106-x

Johnson, Ruth; Ding, Yi; Venkateswaran, Vidhya; Bhattacharya, Arjun; Boulier, Kristin; Chiu, Alec; Knyazev, Sergey; Schwarz, Tommer; Freund, Malika; Zhan, Lingyu; et al (December 2022, Genome Medicine)

Abstract Background: Large medical centers in urban areas, like Los Angeles, care for a diverse patient population and offer the potential to study the interplay between genetic ancestry and social determinants of health. Here, we explore the implications of genetic ancestry within the University of California, Los Angeles (UCLA) ATLAS Community Health Initiative—an ancestrally diverse biobank of genomic data linked with de-identified electronic health records (EHRs) of UCLA Health patients (N = 36,736). Methods: We quantify the extensive continental and subcontinental genetic diversity within the ATLAS data through principal component analysis, identity-by-descent, and genetic admixture. We assess the relationship between genetically inferred ancestry (GIA) and >1500 EHR-derived phenotypes (phecodes). Finally, we demonstrate the utility of genetic data linked with EHR to perform ancestry-specific and multi-ancestry genome and phenome-wide scans across a broad set of disease phenotypes. Results: We identify 5 continental-scale GIA clusters including European American (EA), African American (AA), Hispanic Latino American (HL), South Asian American (SAA) and East Asian American (EAA) individuals and 7 subcontinental GIA clusters within the EAA GIA corresponding to Chinese American, Vietnamese American, and Japanese American individuals. Although we broadly find that self-identified race/ethnicity (SIRE) is highly correlated with GIA, we still observe marked differences between the two, emphasizing that the populations defined by these two criteria are not analogous. We find a total of 259 significant associations between continental GIA and phecodes even after accounting for individuals’ SIRE, demonstrating that for some phenotypes, GIA provides information not already captured by SIRE. GWAS identifies significant associations for liver disease in the 22q13.31 locus across the HL and EAA GIA groups (HL p-value = 2.32 × 10⁻¹⁶, EAA p-value = 6.73 × 10⁻¹¹). A subsequent PheWAS at the top SNP reveals significant associations with neurologic and neoplastic phenotypes specifically within the HL GIA group. Conclusions: Overall, our results explore the interplay between SIRE and GIA within a disease context and underscore the utility of studying the genomes of diverse individuals through biobank-scale genotyping linked with EHR-based phenotyping.
Full Text Available
